Inspired by the analysis of several empirical online social networks, we propose a simple
reaction-diffusion-like coevolving model, in which individuals are activated to create links based on their
states, influenced by local dynamics and their own intention. It is shown that the model can reproduce the
remarkable properties observed in empirical online social networks; in particular, the assortative
coefficients are neutral or negative, and the power law exponents $\gamma$ are smaller than 2. Moreover, we
demonstrate that, under appropriate conditions, the model network naturally makes transition(s) from
assortative to disassortative, and from sparse to dense in its characteristics. The model is useful in
understanding the formation and evolution of online social networks.



Massive websites (Facebook, Twitter, MySpace, Linkedin, Flickr, Orkut, Google+, Weaklink, just to
name a few) have boomed in the past few years; on them, millions of users and their interactions naturally
form the so-called online social networks (OSNs). For OSNs, one important characteristic is the strong
interplay between user behaviour and network topology. On the one hand, user behaviour is affected
by the topology-dependent information flowing in the networks; on the other hand, the network topology
continually evolves as a natural consequence of network dynamics. Due to this feature, OSNs exhibit certain
correlation patterns during evolution, such as highly skewed degree distributions, the generalized Gibrat's
law, assortativity/disassortativity, etc., which are of great importance for understanding the possible
generic laws governing the organization and evolution of networked systems.

Recently, two interesting phenomena in OSNs have attracted much attention. The first is related to the
assortativity/disassortativity property of the network, an important structural measure characterizing
the degree correlation between pairwise nodes. Mathematically, the assortative coefficient can be defined as the
Pearson correlation coefficient of the degrees of adjacent nodes, taken over all links in the network. As shown
in Table I, it is reported that some OSNs (e.g., Twitter and Cyworld) show negative or neutral assortative
coefficients, and some OSNs, such as Weaklink and Google+ (G+), even convert from being assortative to being
disassortative during evolution. These findings challenge the traditional knowledge that biological and technical
networks (e.g., financial networks) are disassortative, while social networks (e.g., acquaintance networks) are
assortative. The second is the scale-free property, which can be characterized by a
power law exponent $\gamma$ as in $p(k) \sim k^{-\gamma}$, where $k$ and $p(k)$ are the node degree and the
degree distribution, respectively. In the thermodynamic limit, i.e., the network size $N \to \infty$, the mean
degree of a scale-free network diverges when $\gamma \le 2$, because the sum
$\sum_k k\,p(k) \sim \sum_k k^{1-\gamma}$ no longer converges. Therefore, $\gamma = 2$ is an important boundary,
and scale-free networks can be classified into dense ($\gamma \le 2$) and sparse ($\gamma > 2$) accordingly.
Previously, many scale-free networks were found to be sparse. However, as shown in Table I, some large OSNs,
e.g., YouTube (YT), Digg, and Livejournal (LJ), turn out to be dense scale-free networks with $\gamma < 2$.
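
As an illustration of the assortative coefficient described above, the following is a minimal sketch for an undirected network, assuming the network is given as a plain list of links. The function name `degree_assortativity` and the convention of returning 0.0 when all degrees are equal are choices made for this sketch, not taken from the paper; for the directed empirical networks the paper correlates out-degree with in-degree instead.

```python
def degree_assortativity(edges):
    """Pearson correlation between the degrees at the two ends of each link.

    edges: list of (u, v) pairs describing an undirected simple network.
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    # Count every link in both orientations so the measure is symmetric.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]

    n = len(xs)
    mean = sum(xs) / n  # xs and ys hold the same values, hence the same mean
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    # The coefficient is undefined when every node has the same degree;
    # this sketch returns 0.0 in that case.
    return cov / var if var > 0 else 0.0


# A star (one hub, four leaves) is maximally disassortative.
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
print(degree_assortativity(star))
```

In the star every link joins the highest-degree node to a lowest-degree node, which is the hub-to-fan pattern the text associates with disassortative OSNs.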

In Table I, the basic statistical properties of 14 popular OSNs are listed. These OSNs basically
share common properties observed in real world networks, such as power-law degree distributions, large
clustering coefficients, and small average shortest paths. However, two features, i.e., negative or
neutral assortative coefficients and $\gamma < 2$, also turn out to be typical. In order to obtain insights
into the evolution patterns of real OSNs, it is desirable to set up a dynamical model that can reproduce
the properties and dynamics observed in real OSNs. Previously, the power law distribution of degrees and
the disassortative correlation have been studied separately in theoretical models, and in most models the
exponents $\gamma$ of the degree distributions are larger than 2 (see Table III of Ref 1 and the
corresponding references). Recently, some theoretical works discussed the relevant properties of networks
with specific functions to determine the degree distribution of the nodes. However, little attention has
been paid to the dynamical origin of dense and/or disassortative OSNs, especially the transition from
assortative to disassortative during the evolution of real networks.

Table I | Properties of typical OSNs, including the number of nodes $N$, the average degree $\langle k \rangle$,
the average shortest path, the exponent of the power law for out-degree (in-degree) $\gamma_{out}$
($\gamma_{in}$), the average clustering coefficient $\langle c \rangle$, and the assortative coefficient $r$,
which is defined as the correlation between out-degree and in-degree as the links in OSNs are directional.
The empirical data sets analyzed in this paper are also listed here, i.e., Flickr, FriendFeed (FF), aNobii,
and Epinions.

Recently, we proposed a dynamical model based on empirical analysis of real OSNs such as Flickr and
Epinions. It is shown that this simple reaction-diffusion-like model can reproduce statistical
properties consistent with real data. In the present paper we investigate, through modeling and
simulations, the two remarkable observations that some OSNs are dense and/or disassortative. Specifically,
based on extensive empirical analysis of real network data, including Flickr, FriendFeed (FF), aNobii,
and Epinions, we set up an evolution model, aiming at reproducing the above properties observed in
OSNs. In the model, we characterize the user behaviour (local dynamics) in the OSNs by a state function.
By considering a mechanism of local interplay generating new links, i.e., the formation of triadic
closure, we are able to describe the network evolution as a reaction-diffusion-like process, in which
the network dynamics and topology evolve simultaneously and interdependently. As a natural consequence
of the coevolution, the resulting networks exhibit the typical properties observed in real OSNs.
Specifically, we show that the network is capable of making the transition from being sparse to
dense, and from being assortative to disassortative during the evolution. We also offer some heuristic
explanation for the above behaviour of the OSNs in our model.

Although the current model shares the same framework as in Ref 33, i.e., it is based on the
reaction-diffusion-like local interaction pattern, we emphasize that there are differences between them.
Firstly, the modeling in Ref 33 deals with typical dual-component networks consisting of users and items,
while in this work we only consider the social network, i.e., the user connections in the OSNs. Secondly,
Ref 33 mainly investigated how the user connections, i.e., the social network, influence the formation of
cross links (connections between users and items), and the dynamical correlations and patterns among
different types of degrees. In this paper, we focus on the dynamical origin of the transition from
assortative to disassortative, and from sparse to dense, in the characteristics of OSNs. In addition, in
the current model, we introduce the general Fermi function to simulate the diversity of user dynamics,
which should be more reasonable than the random connection in Ref 33.

Results 

Empirical analysis. The mechanism of link formation is the central dynamical process during network
evolution. In their seminal work, Barabási and Albert proposed a general rule governing the growth of
networks, the preferential attachment (PA), which can explain the scale-free properties observed in many
real world networks. Since then, much attention has been paid to the investigation of possible
microscopic mechanisms underlying the PA phenomenon. So far, this important question is still open and
challenging. In this paper, we first carry out an empirical study on four typical OSNs, including
Flickr, FF, aNobii, and Epinions (see Methods for data description). Our particular interest is in the
patterns of link creation during network evolution.

To facilitate the analysis, we divide the new links into two mutually exclusive types, the balance links
and the distant links, based on the topological distance. If a new link is formed between a user and
one of his second neighbours, i.e., a user who is two hops away from him in the network, it is regarded
as a balance link. Otherwise, it belongs to the distant links. Obviously, generating a balance link
always contributes a triangle to the network. By distinguishing between these two types of new links,
we can investigate the dependence of new links on the topological distance.
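
The classification above can be sketched as follows. This is a minimal illustration under stated assumptions: the network snapshot is stored as a dictionary of neighbour sets, link directions are ignored, and the names `common_neighbours` and `classify_new_link` are hypothetical helpers introduced here rather than part of the original analysis code.

```python
def common_neighbours(adj, u, v):
    """Users adjacent to both u and v in the snapshot taken before the new link."""
    return adj.get(u, set()) & adj.get(v, set())


def classify_new_link(adj, u, v):
    """Label a new link u-v as 'balance' or 'distant'.

    adj: dict mapping each user to the set of his neighbours (undirected view)
         in the snapshot taken before the link appears.
    """
    if v in adj.get(u, set()):
        raise ValueError("u and v are already connected")
    # Two unconnected users are second neighbours exactly when they share
    # at least one common neighbour, so the new link closes a triangle.
    return "balance" if common_neighbours(adj, u, v) else "distant"


snapshot = {1: {2}, 2: {1, 3}, 3: {2}, 4: set()}
print(classify_new_link(snapshot, 1, 3))  # users 1 and 3 share neighbour 2
print(classify_new_link(snapshot, 1, 4))  # user 4 is not within two hops of user 1
```

The size of the set returned by `common_neighbours` is the quantity against which the formation probability of balance links is measured later in this section.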

The main method we use to analyze the pattern of link growth is to measure the conditional probability
that nodes acquire (create) new links with respect to their existing in-degree (out-degree) (see
Methods for details). The main empirical results for the four OSNs are summarized in Table II and
illustrated in Fig. 1. Interestingly, the relative probabilities of acquiring or creating new links
follow a power law with respect to the existing degrees, indicating that users with larger out-degree
(in-degree) are more likely to create (acquire) new links. Moreover, the exponents $\alpha$ for the
balance links are significantly larger than those for the distant links, as shown in Figs. 1(a) and 1(b).
This suggests that the balance links depend on the local topological structure more than the distant
links do. We attribute this preferential formation of balance links in the OSNs to the locality of
information in such networks, i.e., users within a neighbourhood usually tend to influence each other.

Table II | Exponents $\alpha$ for empirical networks, characterizing the dependence of balance links and
distant links (in the parentheses) on the degree and the number of common neighbours, i.e.,
$\kappa(x) \sim x^{\alpha+1}$. Exponents are given for PA, for preferential creation, and for common
neighbours. For comparison, exponents $\alpha$ for balance links in the model networks are also listed in
the brackets.

To further examine the micro-dynamics in the process of link formation, we measure the probability of
forming balance links with respect to the number of common neighbours between the source node and the
destination node. As shown in Table II and Fig. 1(c), the probability is (approximately) linearly
proportional to the number of common neighbours. Thus the preferential formation of balance links can
be understood as a two-step random walk in the network. Here, by carefully examining the four OSNs
mentioned above, we obtain empirical evidence that the preferential formation of triadic closure, i.e.,
the formation of balance links, can be one possible micro-dynamical process leading to the PA
phenomenon in OSNs.

Modelling. The above empirical analysis has shown that in the OSNs studied, users essentially influence
each other's behaviour within the neighbourhood, and such an interplay in turn regulates the global
evolution of the network. This suggests that local dynamics plays a leading role in the formation of
new links during evolution. Based on this finding, in the following we set up a coevolving model, which
is driven only by local interactions at the microscopic level, i.e., preferential formation of triadic
closures and influence within the neighbourhood. For simplicity, we neglect the link directions in the
modelling, i.e., we only consider an undirected network.

In order to describe the dynamics of the users, we introduce a state function $\phi(i, t)$ for each user
in the network. Here $i$ and $t$ denote the node and time, respectively. The value of the state function
describes the willingness of the user to create links. For each user in the network, we assume that his
state function satisfies the following reaction-diffusion-like equation:

$$\phi(i, t+1) - \phi(i, t) = \phi_0 + \mu \sum_{j \in \mathcal{N}(i)} \left[ k_j(t+1) - k_j(t) \right], \tag{1}$$

where the two parameters $\mu$ and $\phi_0$ are constants, $k_j(t)$ is the degree of user $j$ at time $t$,
and the sum runs over the neighbours $\mathcal{N}(i)$ of user $i$. The LHS of the equation is the change of
the state function with time, which is driven by two "forces": reaction and diffusion. The first term on
the RHS, i.e., $\phi_0$, is a source term denoting the reaction, which means that a user can change his
state on his own. The second term describes the diffusion process, i.e., how the interplay in the
neighbourhood of user $i$ changes his state function. Basically, if the neighbours of user $i$ build new
links, his state function will be increased as a result of this influence. We set a threshold $\Theta$
for the state function of each user. If the state function exceeds the threshold, the user will be
activated, and has a probability $F(k_i)$ to actively create a new link. Once a user has built a new link
or his state function has exceeded the threshold, his state $\phi(i, t)$ is reset to zero at the next
time step. Essentially, the model simulates the user logins and activities in the OSNs in terms of the
state functions.

As shown in Fig. 1(b), users with more friends, i.e., with larger degrees, turn out to be more active in
generating new links. To characterize the diversity of users' activities, we adopt a general Fermi
function, which has been extensively used in evolutionary game models, as the adaptive acceptance
probability for each activated user:

$$F(k_i) = \frac{1}{1 + 20\, e^{-0.001 \left( k_i - \langle k \rangle \right)}}. \tag{2}$$

Here $k_i$ is the degree of user $i$; $\langle k \rangle = 2(m + 1)$, which is determined by the parameter
$m$ in the model, represents the average degree of the whole network; and 0.001 denotes the intensity of
selection. $F(k_i)$ monotonically saturates to 1 as $k_i$ increases, modulating the acceptance
probability of nodes with different degrees. The parameter values (20 and 0.001) do not affect the
qualitative behaviour of the model. In this paper, we choose the parameter values to allow the
assortative coefficient to vary in a relatively wide range. We emphasize that the acceptance probability
$F(k)$ may take other forms as long as it behaves similarly to the Fermi function.
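
The acceptance probability of Eq. (2) can be written as a short function. This is a minimal sketch; the argument names `weight` and `beta` for the constants 20 and 0.001 are labels introduced here for readability and do not appear in the paper.

```python
from math import exp


def acceptance_probability(k_i, m, weight=20.0, beta=0.001):
    """Fermi-type acceptance probability F(k_i) of Eq. (2).

    k_i:    degree of the activated user
    m:      model parameter; the average degree is taken as <k> = 2(m + 1)
    weight: prefactor of the exponential (20 in the text)
    beta:   intensity of selection (0.001 in the text)
    """
    mean_k = 2 * (m + 1)
    return 1.0 / (1.0 + weight * exp(-beta * (k_i - mean_k)))


# F grows monotonically with the degree and saturates to 1 for very large k_i.
for k in (1, 22, 1000, 10000):
    print(k, acceptance_probability(k, m=10))
```

At $k_i = \langle k \rangle$ the exponential equals 1, so a user of average degree is accepted with probability $1/21$; with such a small intensity of selection the probability rises only slowly with degree.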

Figure 1 | The influence of the current topological status on the formation of balance links and distant
links in the aNobii and FF (in the insets) networks. (a) The cumulative functions of the relative
probability $\kappa^{in}(k_{in})$ for PA versus the in-degree of the destination nodes; (b)
$\kappa^{out}(k_{out})$ for preferential creation versus the out-degree of the source nodes; (c) The
cumulative functions of the relative probability $\kappa(u)$ for a pair of users to build a social link
given that they have already shared $u$ common neighbours, for all balance links. The exponents are
obtained by fitting the curves of $\kappa(k)$ averaged over different initial snapshots. The straight
lines are guides to the eye throughout this paper.

Figure 2 | Schematic illustration of the coevolution of both topology (a-c) and dynamical states (d-f)
in the model. The numeric tags are the values of the state functions. (a) The network at time $t$, when
some users (solid) are activated according to their states. (b) One-step updating of the network
topology: new user U16 joins and randomly connects to user U11; some activated users connect to their
second neighbours according to the acceptance probability, e.g., U2 to U13 and U12 to U3. The arrows
represent the diffusion process. (c) At time $t+1$, the states of users are updated according to
equation (1). The states of activated users (U9 and U10) and those users building new links (U2, U13,
U3, U14, U11, U16) at time $t$ are reset to 0, but some nodes are activated again according to their
states at time $t+1$. (d)-(f) Evolution of the degree and the state function for a specific user during a
certain time period in the model. (d) Evolution of the degree $k(t)$. (e) The activities of the user. The
solid lines indicate the moments when the user actively increases his social degree (e.g., U2 to U13 in
(b)), and the dotted lines represent the moments when the user passively increases his social degree
(e.g., U3 was connected by U12 in (b)). (f) Evolution of the state function $\phi(t)$. Parameters for the
model: $m = 10$, $\mu = 1$, $\phi_0 = 0.1$, $\Theta = 100$.

Specifically, the algorithm to realize the model works as follows: (1) At the very beginning, the
initial network consists of a few users ($N_0$), forming a small connected random network. The state
functions of users in the network evolve according to Eq. (1). (2) Adding users: at every time step, one
new user is added to the network and randomly connects to an existing user. (3) Adding links: at each
time step, $m$ users are randomly selected from the activated users with the acceptance probability
$F(k_i)$ (Eq. (2)), and each connects to one of his second neighbours if they are not connected. If the
number of activated users is less than $m$, the remaining users are randomly chosen from the network.
The above procedure is schematically illustrated in Fig. 2, where the states and the topology coevolve
for one step driven by the local dynamics. As shown in Figs. 2(d)-2(f), with the increase of $k$, the
period of the state $\phi(i, t)$ for a user could become smaller, indicating that users with larger
degrees are more frequently activated.
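
The three steps above can be sketched as a small simulation. This is an illustrative reading of the algorithm, not the authors' implementation, and several details are assumptions of the sketch: the seed network is a ring of `n0` users, acceptance is tested independently for each activated user before at most `m` of them are kept, every user whose degree changed in a step is reset together with the activated users, and the function name `simulate` and its argument names are hypothetical.

```python
import random
from math import exp


def simulate(n_final, m=3, mu=1.0, phi0=0.1, theta=100.0, n0=5, seed=0):
    """Grow an undirected network to n_final users; returns {user: set(neighbours)}."""
    rng = random.Random(seed)
    # Step (1): small connected seed network (a ring is an assumption of this sketch).
    adj = {i: {(i - 1) % n0, (i + 1) % n0} for i in range(n0)}
    phi = {i: 0.0 for i in range(n0)}
    mean_k = 2 * (m + 1)

    def accept(k):
        # Fermi acceptance probability, Eq. (2).
        return 1.0 / (1.0 + 20.0 * exp(-0.001 * (k - mean_k)))

    def add_link(u, v, gained):
        adj[u].add(v)
        adj[v].add(u)
        gained[u] = gained.get(u, 0) + 1
        gained[v] = gained.get(v, 0) + 1

    while len(adj) < n_final:
        gained = {}  # degree increments k(t+1) - k(t) made during this step
        activated = [i for i in adj if phi[i] > theta]

        # Step (2): one new user joins and connects to a random existing user.
        new = len(adj)
        target = rng.randrange(new)
        adj[new], phi[new] = set(), 0.0
        add_link(new, target, gained)

        # Step (3): up to m activated users pass the acceptance test; any
        # shortfall is filled with users drawn at random from the network.
        rng.shuffle(activated)
        builders = [u for u in activated if rng.random() < accept(len(adj[u]))][:m]
        if len(builders) < m:
            chosen = set(builders)
            pool = [i for i in adj if i not in chosen]
            builders += rng.sample(pool, min(m - len(builders), len(pool)))
        for u in builders:
            # Second neighbours: friends of friends that are not yet linked to u.
            second = set().union(*(adj[w] for w in adj[u])) - adj[u] - {u}
            if second:
                add_link(u, rng.choice(sorted(second)), gained)

        # State update, Eq. (1). Activated users and users involved in a new
        # link are reset to zero; all others accumulate reaction and diffusion.
        reset = set(activated) | set(gained)
        for i in adj:
            if i in reset:
                phi[i] = 0.0
            else:
                phi[i] += phi0 + mu * sum(gained.get(j, 0) for j in adj[i])
    return adj


network = simulate(500, m=3)
print(sum(len(nbrs) for nbrs in network.values()) / len(network))  # close to 2(m + 1)
```

Each step adds one link for the new user plus at most `m` triadic closures, which is where the average degree $\langle k \rangle = 2(m + 1)$ quoted in the text comes from. The full scan for activated users makes every step linear in the network size, so this version is only practical for small networks.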

Verifications. In our model, although we consider only simple local rules as the force driving network
evolution, numerical experiments show that the model exhibits the main properties observed in empirical
OSNs, such as large clustering coefficients, small average shortest paths, and power-law degree
distributions. In order to verify our model, we first compare the degree distributions of the model
network with those of the empirical networks in Fig. 3. The distributions are qualitatively consistent
with each other under appropriate parameters. In empirical networks, the probability of building a new
link depends on the existing degrees, as shown in Fig. 1 and Table II. To compare the dynamics of our
model with that of the empirical networks, we also applied the same analysis to the model under the same
parameters as in Fig. 3 and summarized the results in Table II (in the brackets). The characteristic
exponents $\alpha$ are qualitatively consistent with the empirical ones.

We now focus on the two major properties of the model network: the power-law exponent $\gamma$ and the
assortative coefficient $r$. First, we investigate how the exponent $\gamma$ varies with respect to the
model parameter $m$. In this work, the best power-law exponents $\gamma$ are calculated using the maximum
likelihood method. As shown in Fig. 4(a), for small $m$, the distribution of degrees follows a stretched
power law with exponent $\gamma$ larger than 2, while for large $m$ the exponent $\gamma$ turns out to be
smaller than 2. As we know, many real world OSNs are characterized by $\gamma < 2$. The present model can
produce this important feature in flexible parameter regimes. In Fig. 4(a), we show the degree
distributions for different network sizes. They are almost the same, indicating that the statistical
properties of the model network are stable after long time evolution. We further find that, as parameter
$m$ increases, the exponents $\gamma$ decrease across 2, as shown in Figs. 4(b) and 4(c), indicating that
the generated network makes a transition from a sparse scale-free network to a dense network. To justify
the power law fitting, we compute the p-value for the power law model, which measures how well the power
law fits the data. As shown in the insets of Figs. 4(b) and 4(c), the p-values are generally larger than
0.25 and the averages are 0.60 and 0.63, respectively, indicating that the power law model is a plausible
fit to the data.
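
For orientation, a maximum likelihood estimate of a power-law exponent can be sketched as below. The paper does not state which variant it uses, so this is an assumption: the sketch applies the standard continuous-variable estimator $\hat{\gamma} = 1 + n / \sum_i \ln(x_i / x_{min})$ with a lower cut-off `x_min` supplied by the caller. The function names are hypothetical, and the goodness-of-fit p-value mentioned in the text is not computed here.

```python
import math
import random


def powerlaw_mle(values, x_min):
    """Continuous maximum likelihood estimate of gamma in p(x) ~ x^(-gamma), x >= x_min."""
    tail = [x for x in values if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0:
        raise ValueError("need at least one value strictly above x_min")
    return 1.0 + len(tail) / log_sum


def sample_powerlaw(n, gamma, x_min, rng):
    """Draw n values from a continuous power law by inverting its cumulative distribution."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (gamma - 1.0)) for _ in range(n)]


rng = random.Random(42)
data = sample_powerlaw(20000, gamma=2.5, x_min=1.0, rng=rng)
print(powerlaw_mle(data, x_min=1.0))  # should lie close to 2.5
```

Degrees are integers, so applying the continuous estimator to a degree sequence is only an approximation; for real fits, a treatment that handles discrete data and selects `x_min` from the data would be preferable.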

Figure 3 | Comparing the degree distributions of the empirical networks with those of the model network.
Since the model network is undirected, we ignored the direction of links in the empirical networks for
comparison. (a) Flickr, where the parameters of the model are $m = 25$, $N = 500{,}000$, $\mu = 1$,
$\phi_0 = 0.01$, $\Theta = 100$. (b) FriendFeed, where the parameters of the model are $m = 8$,
$N = 200{,}000$, $\mu = 1$, $\phi_0 = 0.02$, $\Theta = 200$. (c) aNobii, where the parameters of the model
are $m = 7$, $N = 100{,}000$, $\mu = 1$, $\phi_0 = 0.02$, $\Theta = 180$. (d) Epinions, where the parameters
of the model are $m = 25$, $N = 100{,}000$, $\mu = 1$, $\phi_0 = 0.01$, $\Theta = 100$.

We then investigate the assortative coefficient $r$ in the model. Since the links in the model are
undirected, the assortative coefficient $r$ is defined as the correlation between the degrees of pairwise
nodes. As shown in Figs. 5(a) and 5(b), with the increase of parameter $m$, $r$ changes from positive to
negative, indicating that the model networks convert from being assortative to being disassortative.
There are two important points to emphasize. First, as shown in Fig. 5(a), the change of the sign of $r$
occurs at larger $m$ as parameter $\phi_0$ increases. Second, as shown in Fig. 5(b), the value of $\Theta$
has a significant influence on $r$.

In the above, we have shown that $r$ in the model can convert from positive to negative when parameter
$m$ varies. As reported in Refs. 11, 12, some OSNs convert from being assortative to being
disassortative during evolution. How does $r$ behave with the increase of time in our model? First we
note that the final network size $N$ is proportional to the total evolution time. As shown in Fig. 5(c),
the coefficients $r$ become almost stationary when the model evolves for a sufficiently long time. In
particular, in certain parameter regimes, the generated networks evolve from initial assortativity to
subsequent disassortativity with the increase of time. Therefore, the current model can characterize the
distinct dynamical stages observed in OSNs such as Weaklink and Google+.

The assortative to disassortative change in our model can be heuristically understood based on Eq. (1).
Basically, it is the result of the competition between two factors in our model: the reaction factor
denoted by parameter $\phi_0$, and the diffusion factor denoted by parameter $\mu$. Parameter $m$ is
important because it controls the diffusion and thus can change the ratio of these two factors. When $m$
is small, i.e., the number of new links formed at each time step is small, the local influence is weak
due to the small average degree, i.e., $\langle k \rangle = 2(m + 1)$. In this case, the reaction factor
is relatively more important, and the user's own motive plays a dominant role in the evolution of the
state function. Consequently, the activation probability of a user is almost independent of the degree.
Users thus have an almost equal chance to be activated and connect to others, leading to the assortative
mixing pattern. This may correspond to the situations in some OSNs where users tend to establish links
with people they know in real life, resulting in assortativity in the acquaintance network during the
initial stage. On the other hand, when $m$ is large, according to Eq. (1), the local influence, i.e., the
diffusion, plays a dominant role in the evolution of the state function. In this case, users with larger
degrees have more chance to be activated and connect to others, leading to the disassortative mixing
pattern. In real situations, this may correspond to certain OSNs where celebrities attract their fans to
connect to them.

Figure 4 | Transition from sparse to dense in the model network. (a) Degree distribution for $m = 3$ and
$m = 25$ with different size $N$. (b)-(c) Power law exponents $\gamma$ with respect to parameter $m$ for
different values of $\phi_0$ (b), and for different values of $\Theta$ (c). The insets are the p-values
from the maximum likelihood method. If not specified, the parameters in our simulations are
$N = 500{,}000$, $\mu = 1$, $\phi_0 = 0.1$, $\Theta = 100$ throughout the paper. Results are averaged over
10 realizations.

Figure 5 | Transition from assortativity to disassortativity in the model. (a)-(b) The assortative
coefficient $r$ with respect to the parameter $m$ for different values of $\phi_0$ (a), and for different
values of $\Theta$ (b). The coefficients $r$ are calculated at the final stage in the model when
$N = 500{,}000$. (c) The temporal evolution of the assortative coefficient $r(N)$ for different $m$ at
$\Theta = 100$, and for different values of $\Theta$ at $m = 9$. The error bar is the standard deviation.

To further illustrate how parameter $m$ regulates the assortative mixing pattern in the model network, we
calculate the average nearest neighbours' degree $k_{nn}(k)$ in the generated networks. As shown in
Fig. 6, $k_{nn}(k)$ increases with respect to degree $k$ for small $m$, corresponding to positive
assortativity in model networks. This is consistent with the situation in acquaintance networks. However,
for larger $m$, $k_{nn}(k)$ increases first, and then decreases when the degree is large enough,
corresponding to neutral and negative assortativity, as in some real OSNs. Similarly, the above analysis
can also explain the results shown in Fig. 5(a), where a larger $m$ is required for the transition of $r$
when $\phi_0$ increases. Since $\phi_0$ represents the reaction factor, to overcome the outcome of
increasing $\phi_0$ in the model, the diffusion factor needs to increase too. As a result of this
competition, we observe that the transition occurs at a larger value of $m$.
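
The average nearest neighbours' degree used above can be computed with a few lines. This is a minimal sketch assuming an undirected network stored as a dictionary of neighbour sets; the function name `average_neighbour_degree` is introduced here for illustration.

```python
from collections import defaultdict


def average_neighbour_degree(adj):
    """Return {k: k_nn(k)}: for each degree k, the mean over users of degree k
    of the average degree of their neighbours.

    adj: dict mapping each user to the set of his neighbours (undirected).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for user, neighbours in adj.items():
        k = len(neighbours)
        if k == 0:
            continue  # isolated users have no neighbours to average over
        sums[k] += sum(len(adj[v]) for v in neighbours) / k
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sorted(sums)}


# In a star, the hub (k = 4) sees only leaves and the leaves (k = 1) see only the hub.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
print(average_neighbour_degree(star))
```

A curve that rises with $k$ signals assortative mixing and a falling curve signals disassortative mixing, which is how the trends in Fig. 6 are read in the text.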

In the evolution of real OSNs, generally the average degree increases with time. This roughly
corresponds to the increase of $m$ in the present model due to $\langle k \rangle = 2(m + 1)$. As shown by
our model in Fig. 5(c), this will cause the diffusion factor gradually to become dominant, and the
network may convert from being assortative to being disassortative with the increase of time. Similarly,
the decrease of parameter $\Theta$ is equivalent to the increase of parameter $m$, and the behaviour of
the model in Fig. 5(c) can also be explained from the viewpoint of competition between reaction and
diffusion factors.

Figure 6 | Characterizing the average nearest neighbours' degree $k_{nn}(k)$ for different values of the
parameter $m$. The corresponding assortative coefficients are 0.21, 0.15, 0.12, 0.09, 0.07, 0.007, -0.10,
-0.23, and -0.41 for increasing $m$, respectively.

To support our argument above, we apply empirical analysis to the aNobii network. Specifically, we
regard it as a hybrid of a real world social network and a virtual online network. The former subnetwork
consists of the acquaintance links connecting users who know each other in real life, e.g., their family
members and friends, such as "Acquaintances" in Google+ and "Friendship" in aNobii; the latter comprises
the stranger links connecting their online virtual friends, such as "Following" in Google+ and
"Neighbourhood" in aNobii. In terms of the reaction-diffusion process, the generation of these two types
of links is mainly due to the reaction factor (i.e., the user's personal desire) and the diffusion factor
(i.e., the local influence), respectively. Interestingly, we find that the subnetwork consisting of the
acquaintance links is assortative with $r = 0.06$, like real world social networks. On the contrary, the
subnetwork consisting of the stranger links is disassortative with $r = -0.09$. As shown in Fig. 7, the
relative probabilities of forming stranger links are significantly larger than those of forming
acquaintance links, implying that the diffusion factor is dominant in aNobii. As a result, the aNobii
network as a whole turns out to be disassortative with $r = -0.05$. The above results provide empirical
evidence that the competition between diffusion and reaction might determine the mixing pattern of
degrees in an OSN. Reasonably, during the evolution of the OSNs, if the diffusion factor dominates over
the reaction factor, a transition from assortativity to disassortativity could be expected, as in
Weaklink and Google+.

Discussion 

In this work, based on empirical analysis of four typical OSNs, we set up a reaction-diffusion-like
model, in which the evolution of the network is governed by both the users' personal motives and the
influence within the neighbourhood. As a natural consequence of the coevolution of dynamics and topology,
the model is able to qualitatively reproduce the major properties observed in real world OSNs. In
particular, the generated networks can convert from being sparse to dense, and from being assortative to
disassortative, with appropriate parameters. The model explains these two important features of real
world OSNs in terms of the competition between reaction and diffusion factors in network evolution.

We believe that the current work is enlightening for modeling the evolution of OSNs as well as of other
real world networks. For example, other mechanisms of link formation, such as collective action and the
structural hole mechanism, can be readily formulated and investigated. The idea of the model might be
applicable to a wide range of social networks, and can be easily generalized to treat multi-layer
networks, weighted networks, social-attribute networks, etc. For example, we have recently carried out a
modeling study of Flickr, a typical dual-component and dual-connection OSN, and obtained satisfactory
results.

Methods 

Data description and notations. Flickr is one of the most famous photo-sharing websites. The data set for
our study is collected by daily crawling the Flickr network over 2.3 million users from Nov 2, 2006 to
Dec 3, 2006, and again daily from Feb 3, 2007 to May 18, 2007. In total, there are 104 days in the time
window of data collection (http://socialnetworks.mpisws.org/). There are more than 2.3 million users and
33 million directed links among them. FriendFeed (FF) is a content aggregation site where users discover
and discuss the interesting contents found on the web by their friends. The data set is collected by
crawling the FriendFeed network once within every five days between Feb 26 and May 6, 2009. There are 14
snapshots or 70 days in the time window. More than 200 thousand users were found and about 4 million
directed links among them were identified. aNobii is a website where readers can rate, review and discuss
books with others. The data set is collected by crawling the neighbourhood (stranger in real life) and
friendship (acquaintance in real life) networks of aNobii. Six snapshots of the network, 15 days apart,
are collected starting from Sep 11, 2009. Users connect to each other through two mutually exclusive
types of ties: friendship and neighbourhood links. In the end, the aNobii network includes 86,800 users,
429,482 stranger links and 268,655 acquaintance links. Epinions is a consumer review website where users
can write reviews of products and also "trust" or "distrust" each other. The data set contains the
trusted relationships among users before Aug 12, 2003
(http://www.trustlet.org/wiki/Extended_Epinions_dataset), including 114,467 users and 717,667 trusted
relations. Mathematically, we can use the adjacency matrix $A_{N \times N}$ to characterize the topology
of the online social networks, where $a_{ij} = 1$ if user $i$ declares user $j$ as a friend, and
$a_{ij} = 0$ otherwise. Since the links among users are directional in these four networks, we
accordingly define two types of degrees: the out-degree $k_{out}(i) = \sum_j a_{ij}$, i.e., the number of
friends claimed by user $i$, and the in-degree $k_{in}(j) = \sum_i a_{ij}$, i.e., the number of users who
claim user $j$ as a friend. The statistical properties of these four data sets are listed in Table I.

Figure 7 | Characterizing the difference between the acquaintance and stranger links in the aNobii
network. (a) The cumulative functions of the relative probability $\kappa^{in}(k_{in})$ versus the
in-degree of destination nodes; (b) $\kappa^{out}(k_{out})$ versus the out-degree of source nodes. (c) The
cumulative functions of the relative probability $\kappa(u)$ for a pair of users to build a social link
given that they have already shared $u$ common neighbours.

Measuring preferential attachment. In Refs. 41, 42, a numerical method is used to measure the
preferential attachment (PA) growth of a network. Given that we know the temporal order in which the
nodes join the network, the essential idea of the method is to monitor to which existing node the new
nodes connect, as a function of the degree of the old node. We take an example to briefly explain the
method as follows: (1) At time $t_0$, we mark the nodes with out-degree $k_{out}$ as "$t_0$ nodes",
denoting their number as $C(k_{out})$. (2) After the evolution of a period $\Delta t$, the out-degrees of
the "$t_0$ nodes" have increased due to the evolution of the network (of course, the in-degrees also
change). We count the out-degree created by the "$t_0$ nodes" as $A(k_{out})$. Since we divide the newly
generated links into two types, i.e., the balance and the distant, we have
$A(k_{out}) = A_B(k_{out}) + A_D(k_{out})$, where the subscripts $B$ and $D$ denote the two types,
respectively. (3) The histogram providing the number of out-degree acquired by the "$t_0$ nodes" with
exact out-degree $k_{out}$, after normalization, defines a function:

$$\Pi_i^{out}(k_{out}) = \frac{A_i(k_{out}) / C(k_{out})}{\sum_{k'_{out}} A_i(k'_{out}) / C(k'_{out})}, \tag{3}$$

where $i$ can be either $B$ or $D$. It has been proven that if the PA mechanism exists, the conditional
probability with which the out-degree grows with respect to the existing out-degree follows a power law,
namely $\Pi_i^{out}(k_{out}) \propto k_{out}^{\alpha}$. Numerically, it is convenient to examine the
cumulative function of $\Pi_i^{out}(k_{out})$, which will also follow a power law, i.e.,

$$\kappa_i^{out}(k_{out}) = \int_0^{k_{out}} \Pi_i^{out}(k)\, dk \propto k_{out}^{\alpha + 1}. \tag{4}$$

Similarly, the above numerical method can be applied to the calculation of the probability of acquiring
new links with respect to the in-degrees $k_{in}$, and the treatment of the acquaintance and stranger
links is straightforward.
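
The measurement procedure can be sketched as follows. This is a minimal illustration under explicit assumptions: the relative probability is taken as the per-node rate $A(k)/C(k)$ normalized over all observed degrees, the cumulative function is approximated by a running sum over the observed degree values, and the function and argument names (`relative_probability`, `degree_t0`, `new_links`) are hypothetical.

```python
from collections import Counter
from itertools import accumulate


def relative_probability(degree_t0, new_links):
    """Estimate Pi(k) and its cumulative function kappa(k).

    degree_t0: {node: out-degree at time t0} for the "t0 nodes"
    new_links: {node: number of links of one type created during the period}
    Returns (pi, kappa), two dicts keyed by the degree k in increasing order.
    """
    count = Counter(degree_t0.values())  # C(k): number of t0 nodes with degree k
    gained = Counter()                   # A(k): links created by nodes of degree k
    for node, n_new in new_links.items():
        if node in degree_t0:            # ignore nodes that joined after t0
            gained[degree_t0[node]] += n_new

    rate = {k: gained[k] / count[k] for k in sorted(count)}
    total = sum(rate.values())
    if total == 0:
        raise ValueError("no new links were created by the t0 nodes")
    pi = {k: r / total for k, r in rate.items()}
    kappa = dict(zip(pi, accumulate(pi.values())))
    return pi, kappa


degree_t0 = {"a": 1, "b": 1, "c": 2, "d": 4}
new_links = {"a": 1, "c": 2, "d": 4}
pi, kappa = relative_probability(degree_t0, new_links)
print(pi)
print(kappa)
```

Calling the function once with the counts of balance links and once with those of distant links would give the two curves that are compared in Fig. 1; the exponent $\alpha + 1$ is then read from the slope of `kappa` on a log-log plot.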